R. E. KALMAN

Research Institute for Advanced Study


A New Approach to Linear Filtering 
and Prediction Problems 1 

The classical filtering and prediction problem is re-examined using the Bode-Shannon representation of random processes and the "state-transition" method of analysis of dynamic systems. New results are:

(1) The formulation and methods of solution of the problem apply without modification to stationary and nonstationary statistics and to growing-memory and infinite-memory filters.

(2) A nonlinear difference (or differential) equation is derived for the covariance matrix of the optimal estimation error. From the solution of this equation the coefficients of the difference (or differential) equation of the optimal linear filter are obtained without further calculations.

(3) The filtering problem is shown to be the dual of the noise-free regulator problem.

The new method developed here is applied to two well-known problems, confirming and extending earlier results.

The discussion is largely self-contained and proceeds from first principles; basic concepts of the theory of random processes are reviewed in the Appendix.


Introduction

An important class of theoretical and practical problems in communication and control is of a statistical nature. Such problems are: (i) prediction of random signals; (ii) separation of random signals from random noise; (iii) detection of signals of known form (pulses, sinusoids) in the presence of random noise.

In his pioneering work, Wiener [1] showed that problems (i) and (ii) lead to the so-called Wiener-Hopf integral equation; he also gave a method (spectral factorization) for the solution of this integral equation in the practically important special case of stationary statistics and rational spectra.

Many extensions and generalizations followed Wiener's basic work. Zadeh and Ragazzini solved the finite-memory case [2]. Concurrently and independently of Bode and Shannon [3], they also gave a simplified method [2] of solution. Booton discussed the nonstationary Wiener-Hopf equation [4]. These results are now in standard texts [5-6]. A somewhat different approach along these main lines has been given recently by Darlington [7]. For extensions to sampled signals, see, e.g., Franklin [8], Lees [9]. Another approach based on the eigenfunctions of the Wiener-Hopf equation (which applies also to nonstationary problems whereas the preceding methods in general do not) has been pioneered by Davis [10] and applied by many others, e.g., Shinbrot [11], Blum [12], Pugachev [13], Solodovnikov [14].

In all these works, the objective is to obtain the specification of a linear dynamic system (Wiener filter) which accomplishes the prediction, separation, or detection of a random signal.

Numbers in brackets designate References at end of paper.

Of course, in general these tasks may be done better by nonlinear filters. At present, however, little or nothing is known about how to obtain (both theoretically and practically) these nonlinear filters.

Contributed by the Instruments and Regulators Division and presented at the Instruments and Regulators Conference, March 29-April 2, 1959, of The American Society of Mechanical Engineers.

Note: Statements and opinions advanced in papers are to be understood as individual expressions of their authors and not those of the Society. Manuscript received at ASME Headquarters, February 24, 1959. Paper No. 59-IRD-11.

Present methods for solving the Wiener problem are subject to a number of limitations which seriously curtail their practical usefulness:

(1) The optimal filter is specified by its impulse response. It is not a simple task to synthesize the filter from such data.

(2) Numerical determination of the optimal impulse response is often quite involved and poorly suited to machine computation. The situation gets rapidly worse with increasing complexity of the problem.

(3) Important generalizations (e.g., growing-memory filters, nonstationary prediction) require new derivations, frequently of considerable difficulty to the nonspecialist.

(4) The mathematics of the derivations are not transparent. Fundamental assumptions and their consequences tend to be obscured.

This paper introduces a new look at this whole assemblage of problems, sidestepping the difficulties just mentioned. The following are the highlights of the paper:

(5) Optimal Estimates and Orthogonal Projections. The Wiener problem is approached from the point of view of conditional distributions and expectations. In this way, basic facts of the Wiener theory are quickly obtained; the scope of the results and the fundamental assumptions appear clearly. It is seen that all statistical calculations and results are based on first and second order averages; no other statistical data are needed. Thus difficulty (4) is eliminated. This method is well known in probability theory (see pp. 75-78 and 148-155 of Doob [15] and pp. 455-464 of Loève [16]) but has not yet been used extensively in engineering.

(6) Models for Random Processes. Following, in particular, Bode and Shannon [3], arbitrary random signals are represented (up to second order average statistical properties) as the output of a linear dynamic system excited by independent or uncorrelated random signals ("white noise"). This is a standard trick in the engineering applications of the Wiener theory [2-7]. The approach taken here differs from the conventional one only in the way in which linear dynamic systems are described. We shall emphasize the concepts of state and state transition; in other words, linear systems will be specified by systems of first-order difference (or differential) equations. This point of view is

MARCH 1960/ 35 


natural and also necessary in order to take advantage of the simplifications mentioned under (5).

(7) Solution of the Wiener Problem. With the state-transition method, a single derivation covers a large variety of problems: growing and infinite memory filters, stationary and nonstationary statistics, etc.; difficulty (3) disappears. Having guessed the "state" of the estimation (i.e., filtering or prediction) problem correctly, one is led to a nonlinear difference (or differential) equation for the covariance matrix of the optimal estimation error. This is vaguely analogous to the Wiener-Hopf equation. Solution of the equation for the covariance matrix starts at the time $t_0$ when the first observation is taken; at each later time $t$ the solution of the equation represents the covariance of the optimal prediction error given observations in the interval $(t_0, t)$. From the covariance matrix at time $t$ we obtain at once, without further calculations, the coefficients (in general, time-varying) characterizing the optimal linear filter.

(8) The Dual Problem. The new formulation of the Wiener problem brings it into contact with the growing new theory of control systems based on the "state" point of view [17-24]. It turns out, surprisingly, that the Wiener problem is the dual of the noise-free optimal regulator problem, which has been solved previously by the author, using the state-transition method to great advantage [18, 23, 24]. The mathematical background of the two problems is identical. This has been suspected all along, but until now the analogies have never been made explicit.

(9) Applications. The power of the new method is most apparent in theoretical investigations and in numerical answers to complex practical problems. In the latter case, it is best to resort to machine computation. Examples of this type will be discussed later. To provide some feel for applications, two standard examples from nonstationary prediction are included; in these cases the solution of the nonlinear difference equation mentioned under (7) above can be obtained even in closed form.

For easy reference, the main results are displayed in the form of theorems. Only Theorems 3 and 4 are original. The next section and the Appendix serve mainly to review well-known material in a form suitable for the present purposes.

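Notation Conventions

Throughout the paper, we shall deal mainly with discrete (or sampled) dynamic systems; in other words, signals will be observed at equally spaced points in time (sampling instants). By suitable choice of the time scale, the constant intervals between successive sampling instants (sampling periods) may be chosen as unity. Thus variables referring to time, such as $t$, $t_0$, $\tau$, $T$ will always be integers. The restriction to discrete dynamic systems is not at all essential (at least from the engineering point of view); by using the discreteness, however, we can keep the mathematics rigorous and yet elementary.

Vectors will be denoted by small bold-face letters: $\mathbf{a}, \mathbf{b}, \ldots, \mathbf{u}, \mathbf{x}, \mathbf{y}, \ldots$ A vector or more precisely an $n$-vector is a set of $n$ numbers $x_1, \ldots, x_n$; the $x_i$ are the co-ordinates or components of the vector $\mathbf{x}$.

Matrices will be denoted by capital bold-face letters: $\mathbf{A}, \mathbf{B}, \mathbf{Q}, \boldsymbol{\Phi}, \boldsymbol{\Psi}, \ldots$; they are $m \times n$ arrays of elements $a_{ij}, b_{ij}, q_{ij}, \ldots$ The transpose (interchanging rows and columns) of a matrix will be denoted by the prime. In manipulating formulas, it will be convenient to regard a vector as a matrix with a single column.

Using the conventional definition of matrix multiplication, we write the scalar product of two $n$-vectors $\mathbf{x}$, $\mathbf{y}$ as

$$\mathbf{x}'\mathbf{y} = \sum_{i=1}^{n} x_i y_i = \mathbf{y}'\mathbf{x}$$

The scalar product is clearly a scalar, i.e., not a vector, quantity. Similarly, the quadratic form associated with the $n \times n$ matrix $\mathbf{Q}$ is

$$\mathbf{x}'\mathbf{Q}\mathbf{x} = \sum_{i,j=1}^{n} x_i q_{ij} x_j$$

We define the expression $\mathbf{x}\mathbf{y}'$, where $\mathbf{x}$ is an $m$-vector and $\mathbf{y}$ is an $n$-vector, to be the $m \times n$ matrix with elements $x_i y_j$.

We write $E(\mathbf{x}) = E\mathbf{x}$ for the expected value of the random vector $\mathbf{x}$ (see Appendix). It is usually convenient to omit the brackets after $E$. This does not result in confusion in simple cases since constants and the operator $E$ commute. Thus $E\mathbf{x}\mathbf{y}'$ = matrix with elements $E(x_i y_j)$; $E\mathbf{x}E\mathbf{y}'$ = matrix with elements $E(x_i)E(y_j)$.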
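The three products defined in this section can be illustrated by a minimal sketch in plain Python. The function names and the sample values are illustrative assumptions, not part of the paper; vectors are stored as lists and matrices as lists of rows.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def scalar_product(x: Vector, y: Vector) -> float:
    """x'y: the sum over i of x_i * y_i (a scalar)."""
    return sum(xi * yi for xi, yi in zip(x, y))


def quadratic_form(x: Vector, q: Matrix) -> float:
    """x'Qx: the sum over i, j of x_i * q_ij * x_j (a scalar)."""
    n = len(x)
    return sum(x[i] * q[i][j] * x[j] for i in range(n) for j in range(n))


def outer_product(x: Vector, y: Vector) -> Matrix:
    """xy': the m x n matrix with elements x_i * y_j."""
    return [[xi * yj for yj in y] for xi in x]


# Illustrative values only.
x = [1.0, 2.0]
y = [3.0, 4.0]
Q = [[2.0, 0.0], [0.0, 1.0]]
```

With these values, `scalar_product(x, y)` and `quadratic_form(x, Q)` return numbers, while `outer_product(x, y)` returns a 2 x 2 matrix, matching the distinction drawn in the text between scalar and matrix quantities.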

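For ease of reference, a list of the principal symbols used is given below.

Optimal Estimates

$t$    time in general; present time.
$t_0$    time at which observations start.
$x_1(t)$, $x_2(t)$    basic random variables.
$y(t)$    observed random variable.
$x_1^*(t_1|t)$    optimal estimate of $x_1(t_1)$ given $y(t_0), \ldots, y(t)$.
$L$    loss function (nonrandom function of its argument).
$\epsilon$    estimation error (random variable).

Orthogonal Projections

$\mathcal{Y}(t)$    linear manifold generated by the random variables $y(t_0), \ldots, y(t)$.
$\bar{x}(t_1|t)$    orthogonal projection of $x(t_1)$ on $\mathcal{Y}(t)$.
$\tilde{x}(t_1|t)$    component of $x(t_1)$ orthogonal to $\mathcal{Y}(t)$.

Models for Random Processes

$\boldsymbol{\Phi}(t+1; t)$    transition matrix.
$\mathbf{Q}(t)$    covariance of random excitation.

Solution of the Wiener Problem

$\mathbf{x}(t)$    basic random variable.
$\mathbf{y}(t)$    observed random variable.
$\mathcal{Y}(t)$    linear manifold generated by $\mathbf{y}(t_0), \ldots, \mathbf{y}(t)$.
$\mathcal{Z}(t)$    linear manifold generated by $\tilde{\mathbf{y}}(t|t-1)$.
$\mathbf{x}^*(t_1|t)$    optimal estimate of $\mathbf{x}(t_1)$ given $\mathcal{Y}(t)$.
$\tilde{\mathbf{x}}(t_1|t)$    error in optimal estimate of $\mathbf{x}(t_1)$ given $\mathcal{Y}(t)$.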

Optimal Estimates 

To have a concrete description of the type of problems to be studied, consider the following situation. We are given signal $x_1(t)$ and noise $x_2(t)$. Only the sum $y(t) = x_1(t) + x_2(t)$ can be observed. Suppose we have observed and know exactly the values of $y(t_0), \ldots, y(t)$. What can we infer from this knowledge in regard to the (unobservable) value of the signal at $t = t_1$, where $t_1$ may be less than, equal to, or greater than $t$? If $t_1 < t$, this is a data-smoothing (interpolation) problem. If $t_1 = t$, this is called filtering. If $t_1 > t$, we have a prediction problem. Since our treatment will be general enough to include these and similar problems, we shall use hereafter the collective term estimation.

As was pointed out by Wiener [1], the natural setting of the estimation problem belongs to the realm of probability theory and statistics. Thus signal, noise, and their sum will be random variables, and consequently they may be regarded as random processes. From the probabilistic description of the random processes we can determine the probability with which a particular sample of the signal and noise will occur. For any given set of measured values $\eta(t_0), \ldots, \eta(t)$ of the random variable $y(t)$ one can then also determine, in principle, the probability of simultaneous occurrence of various values $\xi_1(t)$ of the random variable $x_1(t_1)$. This is the conditional probability distribution function

$$\Pr[x_1(t_1) \le \xi_1 \mid y(t_0) = \eta(t_0), \ldots, y(t) = \eta(t)] = F(\xi_1) \tag{1}$$

Evidently, $F(\xi_1)$ represents all the information which the measurement of the random variables $y(t_0), \ldots, y(t)$ has conveyed about the random variable $x_1(t_1)$. Any statistical estimate of the random variable $x_1(t_1)$ will be some function of this distribution and therefore a (nonrandom) function of the random variables $y(t_0), \ldots, y(t)$. This statistical estimate is denoted by $X_1(t_1|t)$, or by just $X_1(t_1)$ or $X_1$ when the set of observed random variables or the time at which the estimate is required are clear from context.

Suppose now that $X_1$ is given as a fixed function of the random variables $y(t_0), \ldots, y(t)$. Then $X_1$ is itself a random variable and its actual value is known whenever the actual values of $y(t_0), \ldots, y(t)$ are known. In general, the actual value of $X_1(t_1)$ will be different from the (unknown) actual value of $x_1(t_1)$. To arrive at a rational way of determining $X_1$, it is natural to assign a penalty or loss for incorrect estimates. Clearly, the loss should be a (i) positive, (ii) nondecreasing function of the estimation error $\epsilon = x_1(t_1) - X_1(t_1)$. Thus we define a loss function by

$$L(0) = 0$$
$$L(\epsilon_2) \ge L(\epsilon_1) \ge 0 \quad \text{when } \epsilon_2 \ge \epsilon_1 \ge 0 \tag{2}$$
$$L(\epsilon) = L(-\epsilon)$$

Some common examples of loss functions are: $L(\epsilon) = a\epsilon^2$, $a\epsilon^4$, $a|\epsilon|$, $a[1 - \exp(-\epsilon^2)]$, etc., where $a$ is a positive constant.
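The three conditions in (2) can be checked numerically. The sketch below is illustrative only: the function names, the default constant `a = 1.0`, and the test grid are assumptions made for the example, and a check on a finite grid is of course not a proof.

```python
import math
from typing import Callable, Sequence


def quadratic(eps: float, a: float = 1.0) -> float:
    return a * eps ** 2


def quartic(eps: float, a: float = 1.0) -> float:
    return a * eps ** 4


def absolute(eps: float, a: float = 1.0) -> float:
    return a * abs(eps)


def saturating(eps: float, a: float = 1.0) -> float:
    return a * (1.0 - math.exp(-eps ** 2))


def satisfies_conditions(loss: Callable[[float], float],
                         grid: Sequence[float]) -> bool:
    """Check conditions (2) on an increasing grid of nonnegative errors:
    L(0) = 0, L(eps) = L(-eps), and L nondecreasing and nonnegative."""
    if loss(0.0) != 0.0:
        return False
    if any(not math.isclose(loss(e), loss(-e)) for e in grid):
        return False
    values = [loss(e) for e in grid]
    return all(later >= earlier >= 0.0
               for earlier, later in zip(values, values[1:]))


# Hypothetical test grid: 0, 0.25, 0.5, ..., 5.0
GRID = [0.25 * k for k in range(21)]
```

A function such as `lambda e: e` fails the symmetry test, so it is not a loss function of type (2) even though it is nondecreasing.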

One (but by no means the only) natural way of choosing the random variable $X_1$ is to require that this choice should minimize the average loss or risk

$$E\{L[x_1(t_1) - X_1(t_1)]\} = E\big[E\{L[x_1(t_1) - X_1(t_1)] \mid y(t_0), \ldots, y(t)\}\big] \tag{3}$$

Since the first expectation on the right-hand side of (3) does not depend on the choice of $X_1$ but only on $y(t_0), \ldots, y(t)$, it is clear that minimizing (3) is equivalent to minimizing

$$E\{L[x_1(t_1) - X_1(t_1)] \mid y(t_0), \ldots, y(t)\} \tag{4}$$

Under just slight additional assumptions, optimal estimates can be characterized in a simple way.

Theorem 1. Assume that $L$ is of type (2) and that the conditional distribution function $F(\xi)$ defined by (1) is:

(A) symmetric about the mean $\bar{\xi}$:

$$F(\xi - \bar{\xi}) = 1 - F(\bar{\xi} - \xi)$$

(B) convex for $\xi \le \bar{\xi}$:

$$F(\lambda\xi_1 + (1 - \lambda)\xi_2) \le \lambda F(\xi_1) + (1 - \lambda)F(\xi_2)$$

for all $\xi_1, \xi_2 \le \bar{\xi}$ and $0 \le \lambda \le 1$.

Then the random variable $x_1^*(t_1|t)$ which minimizes the average loss (3) is the conditional expectation

$$x_1^*(t_1|t) = E[x_1(t_1) \mid y(t_0), \ldots, y(t)] \tag{5}$$

Proof: As pointed out recently by Sherman [25], this theorem follows immediately from a well-known lemma in probability theory.

Corollary. If the random processes $\{x_1(t)\}$, $\{x_2(t)\}$, and $\{y(t)\}$ are gaussian, Theorem 1 holds.

Proof: By Theorem 5, (A) (see Appendix), conditional distributions on a gaussian random process are gaussian. Hence the requirements of Theorem 1 are always satisfied.

In the control system literature, this theorem appears sometimes in a form which is more restrictive in one way and more general in another way:

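Theorem 1-a. If $L(\epsilon) = \epsilon^2$, then Theorem 1 is true without assumptions (A) and (B).

Proof: Expand the conditional expectation (4):

$$E[x_1^2(t_1) \mid y(t_0), \ldots, y(t)] - 2X_1(t_1)E[x_1(t_1) \mid y(t_0), \ldots, y(t)] + X_1^2(t_1)$$

and differentiate with respect to $X_1(t_1)$. Setting the derivative equal to zero gives $X_1(t_1) = E[x_1(t_1) \mid y(t_0), \ldots, y(t)]$. This is not a completely rigorous argument; for a simple rigorous proof see Doob [15], pp. 77-78.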
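Theorem 1-a can be illustrated on a small discrete conditional distribution. The sketch below is an assumption-laden example, not the paper's construction: the three values and their conditional probabilities are invented, and the distribution is deliberately asymmetric so that condition (A) does not hold. The conditional mean is still the minimizer of the expected quadratic loss among the candidates tried.

```python
from typing import Sequence


def conditional_mean(values: Sequence[float], probs: Sequence[float]) -> float:
    """E[x | observations] for a discrete conditional distribution."""
    return sum(p * v for v, p in zip(values, probs))


def expected_quadratic_loss(estimate: float,
                            values: Sequence[float],
                            probs: Sequence[float]) -> float:
    """E[(x - X)^2 | observations], i.e. expression (4) with L(eps) = eps^2."""
    return sum(p * (v - estimate) ** 2 for v, p in zip(values, probs))


def expanded_loss(estimate: float,
                  values: Sequence[float],
                  probs: Sequence[float]) -> float:
    """The expansion used in the proof: E[x^2|.] - 2 X E[x|.] + X^2."""
    second_moment = sum(p * v * v for v, p in zip(values, probs))
    first_moment = conditional_mean(values, probs)
    return second_moment - 2.0 * estimate * first_moment + estimate ** 2


# Hypothetical asymmetric conditional distribution of x_1(t_1).
VALUES = [-1.0, 0.0, 3.0]
PROBS = [0.5, 0.3, 0.2]

mean = conditional_mean(VALUES, PROBS)
# Candidate estimates on a grid centered on the conditional mean.
candidates = [mean + 0.05 * k for k in range(-40, 41)]
best = min(candidates,
           key=lambda c: expected_quadratic_loss(c, VALUES, PROBS))
```

Here `best` coincides with `mean`, and `expanded_loss` agrees with `expected_quadratic_loss` at every candidate, which is the identity the proof differentiates.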

Remarks. (a) As far as the author is aware, it is not known what is the most general class of random processes $\{x_1(t)\}$, $\{x_2(t)\}$ for which the conditional distribution function satisfies the requirements of Theorem 1.

(b) Aside from the note of Sherman, Theorem 1 apparently has never been stated explicitly in the control systems literature. In fact, one finds many statements to the effect that loss functions of the general type (2) cannot be conveniently handled mathematically.

(c) In the sequel, we shall be dealing mainly with vector-valued random variables. In that case, the estimation problem is stated as: Given a vector-valued random process $\{\mathbf{x}(t)\}$ and observed random variables $\mathbf{y}(t_0), \ldots, \mathbf{y}(t)$, where $\mathbf{y}(t) = \mathbf{M}\mathbf{x}(t)$ ($\mathbf{M}$ being a singular matrix; in other words, not all co-ordinates of $\mathbf{x}(t)$ can be observed), find an estimate $\mathbf{X}(t_1)$ which minimizes the expected loss $E[L(\|\mathbf{x}(t_1) - \mathbf{X}(t_1)\|)]$, $\|\cdot\|$ being the norm of a vector.

Theorem 1 remains true in the vector case also, provided we require that the conditional distribution function of the $n$ co-ordinates of the vector $\mathbf{x}(t_1)$,

$$\Pr[x_1(t_1) \le \xi_1, \ldots, x_n(t_1) \le \xi_n \mid \mathbf{y}(t_0), \ldots, \mathbf{y}(t)] = F(\xi_1, \ldots, \xi_n)$$

be symmetric with respect to the $n$ variables $\xi_1 - \bar{\xi}_1, \ldots, \xi_n - \bar{\xi}_n$ and convex in the region where all of these variables are negative.

Orthogonal Projections 

The explicit calculation of the optimal estimate as a function of the observed variables is, in general, impossible. There is an important exception: The processes $\{x_1(t)\}$, $\{x_2(t)\}$ are gaussian.

On the other hand, if we attempt to get an optimal estimate under the restriction $L(\epsilon) = \epsilon^2$ and the additional requirement that the estimate be a linear function of the observed random variables, we get an estimate which is identical with the optimal estimate in the gaussian case, without the assumption of linearity or quadratic loss function. This shows that results obtainable by linear estimation can be bettered by nonlinear estimation only when (i) the random processes are nongaussian and even then (in view of Theorem 5, (C)) only (ii) by considering at least third-order probability distribution functions.

In the special cases just mentioned, the explicit solution of the estimation problem is most easily understood with the help of a geometric picture. This is the subject of the present section.

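Consider the (real-valued) random variables $y(t_0), \ldots, y(t)$. The set of all linear combinations of these random variables with real coefficients

$$\sum_{i=t_0}^{t} a_i y(i) \tag{6}$$

forms a vector space (linear manifold) which we denote by $\mathcal{Y}(t)$. We regard, abstractly, any expression of the form (6) as a "point" or "vector" in $\mathcal{Y}(t)$; this use of the word "vector" should not be confused, of course, with "vector-valued" random variables, etc. Since we do not want to fix the value of $t$ (i.e., the total number of possible observations), $\mathcal{Y}(t)$ should be regarded as a finite-dimensional subspace of the space of all possible observations.

Given any two vectors $u$, $v$ in $\mathcal{Y}(t)$ (i.e., random variables expressible in the form (6)), we say that $u$ and $v$ are orthogonal if $Euv = 0$. Using the Schmidt orthogonalization procedure, as described for instance by Doob [15], p. 151, or by Loève [16], p. 459, it is easy to select an orthonormal basis in $\mathcal{Y}(t)$. By this is meant a set of vectors $e_{t_0}, \ldots, e_t$ in $\mathcal{Y}(t)$ such that any vector in $\mathcal{Y}(t)$ can be expressed as a unique linear combination of $e_{t_0}, \ldots, e_t$ and

$$Ee_i e_j = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases} \qquad (i, j = t_0, \ldots, t) \tag{7}$$

Thus any vector $\bar{x}$ in $\mathcal{Y}(t)$ is given by

$$\bar{x} = \sum_{i=t_0}^{t} a_i e_i$$

and so the coefficients $a_i$ can be immediately determined with the aid of (7):

$$E\bar{x}e_j = E\Big(\sum_{i=t_0}^{t} a_i e_i\Big)e_j = \sum_{i=t_0}^{t} a_i Ee_i e_j = \sum_{i=t_0}^{t} a_i \delta_{ij} = a_j \tag{8}$$

It follows further that any random variable $x$ (not necessarily in $\mathcal{Y}(t)$) can be uniquely decomposed into two parts: a part $\bar{x}$ in $\mathcal{Y}(t)$ and a part $\tilde{x}$ orthogonal to $\mathcal{Y}(t)$ (i.e., orthogonal to every vector in $\mathcal{Y}(t)$). In fact, we can write

$$x = \bar{x} + \tilde{x} = \sum_{i=t_0}^{t} (Exe_i)e_i + \tilde{x} \tag{9}$$

Thus $\bar{x}$ is uniquely determined by equation (9) and is obviously a vector in $\mathcal{Y}(t)$. Therefore $\tilde{x}$ is also uniquely determined; it remains to check that it is orthogonal to $\mathcal{Y}(t)$:

$$E\tilde{x}e_i = E(x - \bar{x})e_i = Exe_i - E\bar{x}e_i$$

Now the co-ordinates of $\bar{x}$ with respect to the basis $e_{t_0}, \ldots, e_t$ are given either in the form $E\bar{x}e_i$ (as in (8)) or in the form $Exe_i$ (as in (9)). Since the co-ordinates are unique, $Exe_i = E\bar{x}e_i$ $(i = t_0, \ldots, t)$; hence $E\tilde{x}e_i = 0$ and $\tilde{x}$ is orthogonal to every base vector $e_i$, and therefore to $\mathcal{Y}(t)$. We call $\bar{x}$ the orthogonal projection of $x$ on $\mathcal{Y}(t)$.

There is another way in which the orthogonal projection can be characterized: $\bar{x}$ is that vector in $\mathcal{Y}(t)$ (i.e., that linear function of the random variables $y(t_0), \ldots, y(t)$) which minimizes the quadratic loss function. In fact, if $\bar{w}$ is any other vector in $\mathcal{Y}(t)$, we have

$$E(x - \bar{w})^2 = E(\tilde{x} + \bar{x} - \bar{w})^2 = E[(x - \bar{x}) + (\bar{x} - \bar{w})]^2$$

Since $\tilde{x}$ is orthogonal to every vector in $\mathcal{Y}(t)$ and in particular to $\bar{x} - \bar{w}$, we have

$$E(x - \bar{w})^2 = E(x - \bar{x})^2 + E(\bar{x} - \bar{w})^2 \ge E(x - \bar{x})^2 \tag{10}$$

This shows that, if $\bar{w}$ also minimizes the quadratic loss, we must have $E(\bar{x} - \bar{w})^2 = 0$, which means that the random variables $\bar{x}$ and $\bar{w}$ are equal (except possibly for a set of events whose probability is zero).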
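The construction in (6)-(10) can be tried out on a finite sample space. In the sketch below, which is an illustration under stated assumptions rather than anything from the paper, a random variable is a list of its values on four equally likely outcomes, so that $Euv$ is the average of the products. The names `expect_product`, `orthonormal_basis`, `project` and all numerical values are hypothetical.

```python
from typing import List

RandomVariable = List[float]  # value on each of N equally likely outcomes


def expect_product(u: RandomVariable, v: RandomVariable) -> float:
    """E uv on a finite sample space with equally likely outcomes."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


def orthonormal_basis(ys: List[RandomVariable]) -> List[RandomVariable]:
    """Schmidt orthogonalization of the observed variables, giving (7)."""
    basis: List[RandomVariable] = []
    for y in ys:
        residual = list(y)
        for e in basis:
            c = expect_product(residual, e)
            residual = [r - c * ei for r, ei in zip(residual, e)]
        norm = expect_product(residual, residual) ** 0.5
        if norm > 1e-12:  # skip variables already in the span
            basis.append([r / norm for r in residual])
    return basis


def project(x: RandomVariable, ys: List[RandomVariable]) -> RandomVariable:
    """Orthogonal projection of x on the manifold generated by ys, as in (9)."""
    x_bar = [0.0] * len(x)
    for e in orthonormal_basis(ys):
        c = expect_product(x, e)
        x_bar = [xb + c * ei for xb, ei in zip(x_bar, e)]
    return x_bar


# Hypothetical zero-mean random variables on four outcomes.
y1 = [1.0, -1.0, 1.0, -1.0]
y2 = [2.0, 0.0, 0.0, -2.0]
x = [2.0, 0.0, 1.0, -3.0]

x_bar = project(x, [y1, y2])
x_tilde = [a - b for a, b in zip(x, x_bar)]
```

The residual `x_tilde` is orthogonal to both observed variables, and for any other linear combination of `y1` and `y2` the squared error splits as in (10).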
These results may be summarized as follows: 


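Theorem 2. Let $\{x(t)\}$, $\{y(t)\}$ be random processes with zero mean (i.e., $Ex(t) = Ey(t) = 0$ for all $t$). We observe $y(t_0), \ldots, y(t)$.

If either

(A) the random processes $\{x(t)\}$, $\{y(t)\}$ are gaussian; or

(B) the optimal estimate is restricted to be a linear function of the observed random variables and $L(\epsilon) = \epsilon^2$;

then

$$x^*(t_1|t) = \text{optimal estimate of } x(t_1) \text{ given } y(t_0), \ldots, y(t) = \text{orthogonal projection } \bar{x}(t_1|t) \text{ of } x(t_1) \text{ on } \mathcal{Y}(t) \tag{11}$$

These results are well known though not easily accessible in the control systems literature. See Doob [15], pp. 75-78, or Pugachev [26]. It is sometimes convenient to denote the orthogonal projection by

$$\bar{x}(t_1|t) \equiv x^*(t_1|t) = \hat{E}[x(t_1) \mid \mathcal{Y}(t)]$$

The notation $\hat{E}$ is motivated by part (A) of the theorem: If the stochastic processes in question are gaussian, then orthogonal projection is actually identical with conditional expectation.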

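Proof. (B) This is a direct consequence of the remarks in connection with (10).

(A) Since $x(t)$, $y(t)$ are random variables with zero mean, it is clear from formula (9) that the orthogonal part $\tilde{x}(t_1|t)$ of $x(t_1)$ with respect to the linear manifold $\mathcal{Y}(t)$ is also a random variable with zero mean. Orthogonal random variables with zero mean are uncorrelated; if they are also gaussian then (by Theorem 5 (B)) they are independent. Thus

$$0 = E\tilde{x}(t_1|t) = E[\tilde{x}(t_1|t) \mid y(t_0), \ldots, y(t)]$$
$$= E[x(t_1) - \bar{x}(t_1|t) \mid y(t_0), \ldots, y(t)]$$
$$= E[x(t_1) \mid y(t_0), \ldots, y(t)] - \bar{x}(t_1|t) = 0$$

Remarks. (d) A rigorous formulation of the contents of this section as $t \to \infty$ requires some elementary notions from the theory of Hilbert space. See Doob [15] and Loève [16].

(e) The physical interpretation of Theorem 2 is largely a matter of taste. If we are not worried about the assumption of gaussianness, part (A) shows that the orthogonal projection is the optimal estimate for all reasonable loss functions. If we do worry about gaussianness, even if we are resigned to consider only linear estimates, we know that orthogonal projections are not the optimal estimate for many reasonable loss functions. Since in practice it is difficult to ascertain to what degree of approximation a random process of physical origin is gaussian, it is hard to decide whether Theorem 2 has very broad or very limited significance.

(f) Theorem 2 is immediately generalized for the case of vector-valued random variables. In fact, we define the linear manifold $\mathcal{Y}(t)$ generated by $\mathbf{y}(t_0), \ldots, \mathbf{y}(t)$ to be the set of all linear combinations

$$\sum_{i=t_0}^{t} \sum_{j=1}^{m} a_{ij} y_j(i)$$

of all $m$ co-ordinates of each of the random vectors $\mathbf{y}(t_0), \ldots, \mathbf{y}(t)$. The rest of the story proceeds as before.

(g) Theorem 2 states in effect that the optimal estimate under conditions (A) or (B) is a linear combination of all previous observations. In other words, the optimal estimate can be regarded as the output of a linear filter, with the input being the actually occurring values of the observable random variables; Theorem 2 gives a way of computing the impulse response of the optimal filter. As pointed out before, knowledge of this impulse response is not a complete solution of the problem; for this reason, no explicit formulas for the calculation of the impulse response will be given.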

Models for Random Processes 

In dealing with physical phenomena, it is not sufficient to give an empirical description but one must have also some idea of the underlying causes. Without being able to separate in some sense causes and effects, i.e., without the assumption of causality, one can hardly hope for useful results.

Transactions of the ASME


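It is a fairly generally accepted fact that primary macroscopic sources of random phenomena are independent gaussian processes. A well-known example is the noise voltage produced in a resistor due to thermal agitation. In most cases, observed random phenomena are not describable by independent random variables. The statistical dependence (correlation) between random signals observed at different times is usually explained by the presence of a dynamic system between the primary random source and the observer. Thus a random function of time may be thought of as the output of a dynamic system excited by an independent gaussian random process.

An important property of gaussian random signals is that they remain gaussian after passing through a linear system (Theorem 5 (A)). Assuming independent gaussian primary random sources, if the observed random signal is also gaussian, we may assume that the dynamic system between the observer and the primary source is linear. This conclusion may be forced on us also because of lack of detailed knowledge of the statistical properties of the observed random signal: Given any random process with known first and second-order averages, we can find a gaussian random process with the same properties (Theorem 5 (C)). Thus gaussian distributions and linear dynamics are natural, mutually plausible assumptions particularly when the statistical data are scant.

How is a dynamic system (linear or nonlinear) described? The fundamental concept is the notion of the state. By this is meant, intuitively, some quantitative information (a set of numbers, a function, etc.) which is the least amount of data one has to know about the past behavior of the system in order to predict its future behavior. The dynamics is then described in terms of state transitions, i.e., one must specify how one state is transformed into another as time passes.

A linear dynamic system may be described in general by the vector differential equation

$$d\mathbf{x}/dt = \mathbf{F}(t)\mathbf{x} + \mathbf{D}(t)\mathbf{u}(t)$$

and

$$\mathbf{y}(t) = \mathbf{M}(t)\mathbf{x}(t) \tag{12}$$

where $\mathbf{x}$ is an $n$-vector, the state of the system (the components $x_i$ of $\mathbf{x}$ are called state variables); $\mathbf{u}(t)$ is an $m$-vector $(m \le n)$ representing the inputs to the system; $\mathbf{F}(t)$ and $\mathbf{D}(t)$ are $n \times n$, respectively $n \times m$, matrices. If all coefficients of $\mathbf{F}(t)$, $\mathbf{D}(t)$, $\mathbf{M}(t)$ are constants, we say that the dynamic system (12) is time-invariant or stationary. Finally, $\mathbf{y}(t)$ is a $p$-vector denoting the outputs of the system; $\mathbf{M}(t)$ is a $p \times n$ matrix; $p \le n$.

The physical interpretation of (12) has been discussed in detail elsewhere [18, 20, 23]. A look at the block diagram in Fig. 1 may be helpful. This is not an ordinary but a matrix block diagram (as revealed by the fat lines indicating signal flow). The integrator in Fig. 1 actually stands for $n$ integrators such that the output of each is a state variable; $\mathbf{F}(t)$ indicates how the outputs of the integrators are fed back to the inputs of the integrators. Thus $f_{ij}(t)$ is the coefficient with which the output of the $j$th integrator is fed back to the input of the $i$th integrator. It is not hard to relate this formalism to more conventional methods of linear system analysis.

If we assume that the system (12) is stationary and that $\mathbf{u}(t)$ is constant during each sampling period, that is

$$\mathbf{u}(t + \tau) = \mathbf{u}(t); \quad 0 \le \tau < 1, \quad t = 0, 1, \ldots \tag{13}$$

then (12) can be readily transformed into the more convenient discrete form

$$\mathbf{x}(t + 1) = \boldsymbol{\Phi}(1)\mathbf{x}(t) + \boldsymbol{\Delta}(1)\mathbf{u}(t); \quad t = 0, 1, \ldots$$

where [18, 20]

$$\boldsymbol{\Phi}(1) = \exp \mathbf{F} = \sum_{i=0}^{\infty} \mathbf{F}^i / i! \qquad (\mathbf{F}^0 = \text{unit matrix})$$

and

$$\boldsymbol{\Delta}(1) = \Big(\int_0^1 \exp \mathbf{F}\tau \, d\tau\Big)\mathbf{D}$$

[Fig. 2 Matrix block diagram of the general linear discrete-dynamic system]

See Fig. 2. One could also express $\exp \mathbf{F}\tau$ in closed form using Laplace transform methods [18, 20, 22, 24]. If $\mathbf{u}(t)$ satisfies (13) but the system (12) is nonstationary, we can write analogously

$$\mathbf{x}(t + 1) = \boldsymbol{\Phi}(t + 1; t)\mathbf{x}(t) + \boldsymbol{\Delta}(t)\mathbf{u}(t)$$
$$\mathbf{y}(t) = \mathbf{M}(t)\mathbf{x}(t) \qquad t = 0, 1, \ldots \tag{14}$$

but of course now $\boldsymbol{\Phi}(t + 1; t)$, $\boldsymbol{\Delta}(t)$ cannot be expressed in general in closed form. Equations of type (14) are encountered frequently also in the study of complicated sampled-data systems [22]. See Fig. 2.

$\boldsymbol{\Phi}(t + 1; t)$ is the transition matrix of the system (12) or (14). The notation $\boldsymbol{\Phi}(t_2; t_1)$ ($t_2$, $t_1$ = integers) indicates transition from time $t_1$ to time $t_2$. Evidently $\boldsymbol{\Phi}(t; t) = \mathbf{I}$ = unit matrix. If the system (12) is stationary then $\boldsymbol{\Phi}(t + 1; t) = \boldsymbol{\Phi}(t + 1 - t) = \boldsymbol{\Phi}(1)$ = const. Note also the product rule: $\boldsymbol{\Phi}(t; s)\boldsymbol{\Phi}(s; r) = \boldsymbol{\Phi}(t; r)$ and the inverse rule $\boldsymbol{\Phi}^{-1}(t; s) = \boldsymbol{\Phi}(s; t)$, where $t$, $s$, $r$ are integers. In a stationary system, $\boldsymbol{\Phi}(t; \tau) = \exp \mathbf{F}(t - \tau)$.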
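For a stationary system, the passage from the continuous form (12) to the discrete form can be sketched numerically by truncating the power series for $\exp \mathbf{F}$. The code below is a minimal illustration, not the paper's procedure: the helper names and the number of terms are assumptions, and the integral is evaluated by integrating the series term by term, which gives $\sum_i \mathbf{F}^i/(i+1)!$.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a: Matrix, b: Matrix) -> Matrix:
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(a: Matrix, c: float) -> Matrix:
    return [[c * x for x in row] for row in a]


def discretize(f: Matrix, d: Matrix, terms: int = 30) -> Tuple[Matrix, Matrix]:
    """Return (Phi(1), Delta(1)) for constant F and D.

    Phi(1)   = sum_i F^i / i!
    Delta(1) = (sum_i F^i / (i+1)!) D   (series for exp(F tau) integrated
                                         term by term over 0 <= tau <= 1)
    The series are truncated after `terms` terms (an assumed cutoff).
    """
    n = len(f)
    phi = identity(n)        # i = 0 term of the exponential series
    integral = identity(n)   # i = 0 term of the integrated series
    power = identity(n)      # running value of F^i
    factorial = 1.0          # running value of i!
    for i in range(1, terms):
        power = mat_mul(power, f)
        factorial *= i
        phi = mat_add(phi, mat_scale(power, 1.0 / factorial))
        integral = mat_add(
            integral, mat_scale(power, 1.0 / (factorial * (i + 1))))
    return phi, mat_mul(integral, d)


# Hypothetical example: a double integrator driven through its second state.
F = [[0.0, 1.0], [0.0, 0.0]]
D = [[0.0], [1.0]]
phi_1, delta_1 = discretize(F, D)
```

For this particular `F` the series terminates after two terms, so the truncation is exact; for a general matrix the cutoff would have to be chosen with the size of the entries of `F` in mind.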

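As a result of the preceding discussion, we shall represent random phenomena by the model

$$\mathbf{x}(t + 1) = \boldsymbol{\Phi}(t + 1; t)\mathbf{x}(t) + \mathbf{u}(t) \tag{15}$$

where $\{\mathbf{u}(t)\}$ is a vector-valued, independent, gaussian random process, with zero mean, which is completely described by (in view of Theorem 5 (C))

$$E\mathbf{u}(t) = 0 \quad \text{for all } t;$$
$$E\mathbf{u}(t)\mathbf{u}'(s) = 0 \quad \text{if } t \ne s$$
$$E\mathbf{u}(t)\mathbf{u}'(t) = \mathbf{Q}(t).$$

Of course (Theorem 5 (A)), $\mathbf{x}(t)$ is then also a gaussian random process with zero mean, but it is no longer independent. In fact, if we consider (15) in the steady state (assuming it is a stable system), in other words, if we neglect the initial state $\mathbf{x}(t_0)$, then

$$\mathbf{x}(t) = \sum_{r=-\infty}^{t-1} \boldsymbol{\Phi}(t; r + 1)\mathbf{u}(r).$$

Therefore if $t \ge s$ we have

$$E\mathbf{x}(t)\mathbf{x}'(s) = \sum_{r=-\infty}^{s-1} \boldsymbol{\Phi}(t; r + 1)\mathbf{Q}(r)\boldsymbol{\Phi}'(s; r + 1).$$

Thus if we assume a linear dynamic model and know the statistical properties of the gaussian random excitation, it is easy to find the corresponding statistical properties of the gaussian random process $\{\mathbf{x}(t)\}$.

Footnote on the assumption of independent gaussian primary sources: The probability distributions will be gaussian because macroscopic random effects may be thought of as the superposition of very many microscopic random effects; under very general conditions, such aggregate effects tend to be gaussian, regardless of the statistical properties of the microscopic effects. The assumption of independence in this context is motivated by the fact that microscopic phenomena tend to take place much more rapidly than macroscopic phenomena; thus primary random sources would appear to be independent on a macroscopic time scale.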
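The steady-state covariance sum for model (15) can be checked in the simplest case. The sketch below assumes a scalar, stationary, stable model $x(t+1) = \phi x(t) + u(t)$ with $|\phi| < 1$ and constant excitation variance $q$, so that the transition from time $r+1$ to time $t$ is $\phi^{t-r-1}$. The function names, the truncation length, and the numerical values are illustrative assumptions. In this special case the sum is a geometric series with value $\phi^{t-s} q / (1 - \phi^2)$.

```python
def covariance_by_sum(phi: float, q: float, t: int, s: int,
                      terms: int = 200) -> float:
    """Truncated version of
        E x(t) x(s) = sum_{r = -inf}^{s-1} Phi(t; r+1) Q Phi(s; r+1)
    for the scalar stationary model, where Phi(t; r+1) = phi ** (t - r - 1).
    Requires t >= s and |phi| < 1.
    """
    total = 0.0
    for r in range(s - 1, s - 1 - terms, -1):
        total += phi ** (t - r - 1) * q * phi ** (s - r - 1)
    return total


def covariance_closed_form(phi: float, q: float, t: int, s: int) -> float:
    """Geometric-series value of the same sum (scalar stationary case only)."""
    return phi ** (t - s) * q / (1.0 - phi ** 2)


# Hypothetical numbers: phi = 0.5, q = 3 give a steady-state variance of 4.
variance = covariance_by_sum(0.5, 3.0, t=3, s=3)
lag_two = covariance_by_sum(0.5, 3.0, t=5, s=3)
```

The value `lag_two` is smaller than `variance` by the factor $\phi^2$, which shows in miniature how the dynamic system induces correlation between values of the process at different times.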

In real life, however, the situation is usually reversed. One is given the covariance matrix $E\mathbf{x}(t)\mathbf{x}'(s)$ (or rather, one attempts to estimate the matrix from limited statistical data) and the problem is to get (15) and the statistical properties of $\mathbf{u}(t)$. This is a subtle and presently largely unsolved problem in experimentation and data reduction. As in the vast majority of the engineering literature on the Wiener problem, we shall find it convenient to start with the model (15) and regard the problem of obtaining the model itself as a separate question. To be sure, the two problems should be optimized jointly if possible; the author is not aware, however, of any study of the joint optimization problem.

In summary, the following assumptions are made about random processes:

Physical random phenomena may be thought of as due to primary random sources exciting dynamic systems. The primary sources are assumed to be independent gaussian random processes with zero mean; the dynamic systems will be linear. The random processes are therefore described by models such as (15). The question of how the numbers specifying the model are obtained from experimental data will not be considered.

Solution of the Wiener Problem 

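Let us now define the principal problem of the paper.

Problem I. Consider the dynamic model

$$\mathbf{x}(t + 1) = \boldsymbol{\Phi}(t + 1; t)\mathbf{x}(t) + \mathbf{u}(t) \tag{16}$$

$$\mathbf{y}(t) = \mathbf{M}(t)\mathbf{x}(t) \tag{17}$$

where $\mathbf{u}(t)$ is an independent gaussian random process of $n$-vectors with zero mean, $\mathbf{x}(t)$ is an $n$-vector, $\mathbf{y}(t)$ is a $p$-vector $(p \le n)$, $\boldsymbol{\Phi}(t + 1; t)$, $\mathbf{M}(t)$ are $n \times n$, resp. $p \times n$, matrices whose elements are nonrandom functions of time.

Given the observed values of $\mathbf{y}(t_0), \ldots, \mathbf{y}(t)$ find an estimate $\mathbf{x}^*(t_1|t)$ of $\mathbf{x}(t_1)$ which minimizes the expected loss. (See Fig. 2, where $\boldsymbol{\Delta}(t) = \mathbf{I}$.)

This problem includes as a special case the problems of filtering, prediction, and data smoothing mentioned earlier. It includes also the problem of reconstructing all the state variables of a linear dynamic system from noisy observations of some of the state variables ($p < n$).

From Theorem 2-a we know that the solution of Problem I is simply the orthogonal projection of $\mathbf{x}(t_1)$ on the linear manifold $\mathcal{Y}(t)$ generated by the observed random variables. As remarked in the Introduction, this is to be accomplished by means of a linear (not necessarily stationary) dynamic system of the general form (14). With this in mind, we proceed as follows.

Assume that $\mathbf{y}(t_0), \ldots, \mathbf{y}(t - 1)$ have been measured, i.e., that $\mathcal{Y}(t - 1)$ is known. Next, at time $t$, the random variable $\mathbf{y}(t)$ is measured. As before let $\tilde{\mathbf{y}}(t|t - 1)$ be the component of $\mathbf{y}(t)$ orthogonal to $\mathcal{Y}(t - 1)$. If $\tilde{\mathbf{y}}(t|t - 1) \equiv 0$, which means that the values of all components of this random vector are zero for almost every possible event, then $\mathcal{Y}(t)$ is obviously the same as $\mathcal{Y}(t - 1)$ and therefore the measurement of $\mathbf{y}(t)$ does not convey any additional information. This is not likely to happen in a physically meaningful situation. In any case, $\tilde{\mathbf{y}}(t|t - 1)$ generates a linear manifold (possibly 0) which we denote by $\mathcal{Z}(t)$. By definition, $\mathcal{Y}(t - 1)$ and $\mathcal{Z}(t)$ taken together are the same manifold as $\mathcal{Y}(t)$, and every vector in $\mathcal{Z}(t)$ is orthogonal to every vector in $\mathcal{Y}(t - 1)$.

Assuming by induction that $\mathbf{x}^*(t_1 - 1|t - 1)$ is known, we can write:

$$\mathbf{x}^*(t_1|t) = \hat{E}[\mathbf{x}(t_1) \mid \mathcal{Y}(t)] = \hat{E}[\mathbf{x}(t_1) \mid \mathcal{Y}(t - 1)] + \hat{E}[\mathbf{x}(t_1) \mid \mathcal{Z}(t)]$$
$$= \boldsymbol{\Phi}(t_1; t_1 - 1)\mathbf{x}^*(t_1 - 1|t - 1) + \hat{E}[\mathbf{u}(t_1 - 1) \mid \mathcal{Y}(t - 1)] + \hat{E}[\mathbf{x}(t_1) \mid \mathcal{Z}(t)] \tag{18}$$

where the last line is obtained using (16).

Let $t_1 = t + s$, where $s$ is any integer. If $s \ge 0$, then $\mathbf{u}(t_1 - 1)$ is independent of $\mathcal{Y}(t - 1)$. This is because $\mathbf{u}(t_1 - 1) = \mathbf{u}(t + s - 1)$ is then independent of $\mathbf{u}(t - 2), \mathbf{u}(t - 3), \ldots$ and therefore by (16-17), independent of $\mathbf{y}(t_0), \ldots, \mathbf{y}(t - 1)$, hence independent of $\mathcal{Y}(t - 1)$. Since, for all $t$, $\mathbf{u}(t)$ has zero mean by assumption, it follows that $\mathbf{u}(t_1 - 1)$ $(s \ge 0)$ is orthogonal to $\mathcal{Y}(t - 1)$. Thus if $s \ge 0$, the second term on the right-hand side of (18) vanishes; if $s < 0$, considerable complications result in evaluating this term. We shall consider only the case $t_1 \ge t$. Furthermore, it will suffice to consider in detail only the case $t_1 = t + 1$ since the other cases can be easily reduced to this one.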

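The last term in (18) must be a linear operation on the random variable $\tilde{\mathbf{y}}(t|t - 1)$:

$$\hat{E}[\mathbf{x}(t + 1) \mid \mathcal{Z}(t)] = \boldsymbol{\Delta}^*(t)\tilde{\mathbf{y}}(t|t - 1) \tag{19}$$

where $\boldsymbol{\Delta}^*(t)$ is an $n \times p$ matrix, and the star refers to "optimal filtering."

The component of $\mathbf{y}(t)$ lying in $\mathcal{Y}(t - 1)$ is $\bar{\mathbf{y}}(t|t - 1) = \mathbf{M}(t)\mathbf{x}^*(t|t - 1)$. Hence

$$\tilde{\mathbf{y}}(t|t - 1) = \mathbf{y}(t) - \bar{\mathbf{y}}(t|t - 1) = \mathbf{y}(t) - \mathbf{M}(t)\mathbf{x}^*(t|t - 1) \tag{20}$$

Combining (18-20) (see Fig. 3) we obtain

$$\mathbf{x}^*(t + 1|t) = \boldsymbol{\Phi}^*(t + 1; t)\mathbf{x}^*(t|t - 1) + \boldsymbol{\Delta}^*(t)\mathbf{y}(t) \tag{21}$$

where

$$\boldsymbol{\Phi}^*(t + 1; t) = \boldsymbol{\Phi}(t + 1; t) - \boldsymbol{\Delta}^*(t)\mathbf{M}(t) \tag{22}$$

Thus optimal estimation is performed by a linear dynamic system of the same form as (14). The state of the estimator is the previous estimate, the input is the last measured value of the observable random variable $\mathbf{y}(t)$, and the transition matrix is given by (22). Notice that physical realization of the optimal filter requires only (i) the model of the random process and (ii) the operator $\boldsymbol{\Delta}^*(t)$.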
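The estimation error is also governed by a linear dynamic system. In fact,

$$\tilde{\mathbf{x}}(t + 1|t) = \mathbf{x}(t + 1) - \mathbf{x}^*(t + 1|t)$$
$$= \boldsymbol{\Phi}(t + 1; t)\mathbf{x}(t) + \mathbf{u}(t) - \boldsymbol{\Phi}^*(t + 1; t)\mathbf{x}^*(t|t - 1) - \boldsymbol{\Delta}^*(t)\mathbf{M}(t)\mathbf{x}(t)$$
$$= \boldsymbol{\Phi}^*(t + 1; t)\tilde{\mathbf{x}}(t|t - 1) + \mathbf{u}(t) \tag{23}$$

[Fig. 3 Matrix block diagram of optimal filter]

Thus $\boldsymbol{\Phi}^*$ is also the transition matrix of the linear dynamic system governing the error.

From (23) we obtain at once a recursion relation for the covariance matrix $\mathbf{P}^*(t)$ of the optimal error $\tilde{\mathbf{x}}(t|t - 1)$. Noting that $\mathbf{u}(t)$ is independent of $\mathbf{x}(t)$ and therefore of $\tilde{\mathbf{x}}(t|t - 1)$, we get

$$\mathbf{P}^*(t + 1) = E\tilde{\mathbf{x}}(t + 1|t)\tilde{\mathbf{x}}'(t + 1|t)$$
$$= \boldsymbol{\Phi}^*(t + 1; t)E\tilde{\mathbf{x}}(t|t - 1)\tilde{\mathbf{x}}'(t|t - 1)\boldsymbol{\Phi}^{*\prime}(t + 1; t) + \mathbf{Q}(t)$$
$$= \boldsymbol{\Phi}^*(t + 1; t)E\tilde{\mathbf{x}}(t|t - 1)\tilde{\mathbf{x}}'(t|t - 1)\boldsymbol{\Phi}'(t + 1; t) + \mathbf{Q}(t)$$
$$= \boldsymbol{\Phi}^*(t + 1; t)\mathbf{P}^*(t)\boldsymbol{\Phi}'(t + 1; t) + \mathbf{Q}(t) \tag{24}$$

where $\mathbf{Q}(t) = E\mathbf{u}(t)\mathbf{u}'(t)$. (The passage from $\boldsymbol{\Phi}^{*\prime}$ to $\boldsymbol{\Phi}'$ in the third line uses $\boldsymbol{\Phi}^*(t + 1; t)\mathbf{P}^*(t)\mathbf{M}'(t) = 0$, a consequence of the orthogonality relation that determines $\boldsymbol{\Delta}^*(t)$.)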

There remains the problem of obtaining an explicit formula for
$\Delta^*$ (and thus also for $\Phi^*$). Since

$$\tilde{x}(t+1|\mathcal{Z}(t)) = x(t+1) - \hat{E}[x(t+1)|\mathcal{Z}(t)]$$

is orthogonal to $\tilde{y}(t|t-1)$, it follows by (19) that

$$0 = E[x(t+1) - \Delta^*(t)\tilde{y}(t|t-1)]\,\tilde{y}'(t|t-1)$$

$$= E\,x(t+1)\tilde{y}'(t|t-1) - \Delta^*(t)\,E\,\tilde{y}(t|t-1)\tilde{y}'(t|t-1).$$

Noting that $\bar{x}(t+1|t-1)$ is orthogonal to $\mathcal{Z}(t)$, the definition of
$P^*(t)$ given earlier, and (17), it follows further

$$0 = E\,\tilde{x}(t+1|t-1)\tilde{y}'(t|t-1) - \Delta^*(t)M(t)P^*(t)M'(t)$$

$$= E[\Phi(t+1;t)\tilde{x}(t|t-1) + u(t)]\,\tilde{x}'(t|t-1)M'(t) - \Delta^*(t)M(t)P^*(t)M'(t).$$

Finally, since $u(t)$ is independent of $x(t)$,

$$0 = \Phi(t+1;t)P^*(t)M'(t) - \Delta^*(t)M(t)P^*(t)M'(t).$$

Now the matrix $M(t)P^*(t)M'(t)$ will be positive definite and hence
invertible whenever $P^*(t)$ is positive definite, provided that none
of the rows of $M(t)$ are linearly dependent at any time, in other
words, that none of the observed scalar random variables $y_1(t),
\ldots, y_p(t)$ is a linear combination of the others. Under these
circumstances we get finally:

$$\Delta^*(t) = \Phi(t+1;t)P^*(t)M'(t)[M(t)P^*(t)M'(t)]^{-1} \tag{25}$$

Since observations start at $t_0$, $\tilde{x}(t_0|t_0 - 1) = x(t_0)$; to begin the
iterative evaluation of $P^*(t)$ by means of equation (24), we must
obviously specify $P^*(t_0) = E\,x(t_0)x'(t_0)$. Assuming this matrix is
positive definite, equation (25) then yields $\Delta^*(t_0)$; equation (22),
$\Phi^*(t_0+1;t_0)$; and equation (24), $P^*(t_0+1)$, completing the cycle.
If now $Q(t)$ is positive definite, then all the $P^*(t)$ will be
positive definite and the requirements in deriving (25) will
be satisfied at each step.

Now we remove the restriction that $t_1 = t + 1$. Since $u(t)$ is
orthogonal to $\mathcal{Y}(t)$, we have

$$x^*(t+1|t) = \hat{E}[\Phi(t+1;t)x(t) + u(t)|\mathcal{Y}(t)] = \Phi(t+1;t)\,x^*(t|t)$$

Hence if $\Phi(t+1;t)$ has an inverse $\Phi(t;t+1)$ (which is always
the case when $\Phi$ is the transition matrix of a dynamic system
describable by a differential equation) we have

$$x^*(t|t) = \Phi(t;t+1)\,x^*(t+1|t)$$

If $t_1 \ge t + 1$, we first observe by repeated application of (16) that

$$x(t+s) = \Phi(t+s;t+1)x(t+1) + \sum_{r=1}^{s-1}\Phi(t+s;t+r+1)u(t+r) \qquad (s \ge 1)$$

Since $u(t+s-1), \ldots, u(t+1)$ are all orthogonal to $\mathcal{Y}(t)$,

$$x^*(t+s|t) = \hat{E}[x(t+s)|\mathcal{Y}(t)]$$

$$= \hat{E}[\Phi(t+s;t+1)x(t+1)|\mathcal{Y}(t)]$$

$$= \Phi(t+s;t+1)\,x^*(t+1|t) \qquad (s \ge 1)$$

If $s < 0$, the results are similar, but $x^*(t-s|t)$ will have
$(1 - s)(n - p)$ co-ordinates.

The results of this section may be summarized as follows:

Theorem 3. (Solution of the Wiener Problem)

Consider Problem I. The optimal estimate $x^*(t+1|t)$ of $x(t+1)$
given $y(t_0), \ldots, y(t)$ is generated by the linear dynamic system

$$x^*(t+1|t) = \Phi^*(t+1;t)\,x^*(t|t-1) + \Delta^*(t)\,y(t) \tag{21}$$

The estimation error is given by

$$\tilde{x}(t+1|t) = \Phi^*(t+1;t)\,\tilde{x}(t|t-1) + u(t) \tag{23}$$

The covariance matrix of the estimation error is

$$\operatorname{cov}\tilde{x}(t|t-1) = E\,\tilde{x}(t|t-1)\tilde{x}'(t|t-1) = P^*(t) \tag{26}$$

The expected quadratic loss is

$$\sum_{i=1}^{n} E\,\tilde{x}_i^2(t|t-1) = \operatorname{trace} P^*(t) \tag{27}$$

The matrices $\Delta^*(t)$, $\Phi^*(t+1;t)$, $P^*(t)$ are generated by the recursion
relations

$$\Delta^*(t) = \Phi(t+1;t)P^*(t)M'(t)[M(t)P^*(t)M'(t)]^{-1} \tag{28}$$

$$\Phi^*(t+1;t) = \Phi(t+1;t) - \Delta^*(t)M(t) \tag{29}$$

$$P^*(t+1) = \Phi^*(t+1;t)P^*(t)\Phi'(t+1;t) + Q(t) \qquad t \ge t_0 \tag{30}$$

In order to carry out the iterations, one must specify the covariance
$P^*(t_0)$ of $x(t_0)$ and the covariance $Q(t)$ of $u(t)$. Finally, for any
$s \ge 0$, if $\Phi$ is invertible

$$x^*(t+s|t) = \Phi(t+s;t+1)\,x^*(t+1|t)$$

$$= \Phi(t+s;t+1)\Phi^*(t+1;t)\Phi(t;t+s-1)\,x^*(t+s-1|t-1) + \Phi(t+s;t+1)\Delta^*(t)\,y(t) \tag{31}$$

so that the estimate $x^*(t+s|t)$ $(s \ge 0)$ is also given by a linear dynamic system of the type (21).
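
As a minimal sketch of how the recursion (28)-(30) of Theorem 3 might be carried out numerically, the following assumes a single scalar observation ($p = 1$), so that $M P^* M'$ is a number and no matrix inversion routine is needed. The helper names (`matmul`, `transpose`, `filter_step`) and the two-state model are illustrative assumptions, not part of the paper.

```python
# Sketch of one pass through the recursion (28)-(30) for a scalar observation
# (p = 1). Matrices are nested lists; every name here is illustrative.

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def filter_step(Phi, M, Q, P):
    """Return (Delta*, Phi*, P*(t+1)) from P*(t); M is a 1 x n row."""
    PMt = matmul(P, transpose(M))            # n x 1 column P*(t) M'(t)
    s = matmul(M, PMt)[0][0]                 # scalar M(t) P*(t) M'(t)
    if s <= 0.0:
        raise ValueError("M P* M' must be positive for (28) to apply")
    Delta = [[row[0] / s] for row in matmul(Phi, PMt)]                       # (28)
    DM = matmul(Delta, M)
    Phi_star = [[p - d for p, d in zip(pr, dr)] for pr, dr in zip(Phi, DM)]  # (29)
    prod = matmul(matmul(Phi_star, P), transpose(Phi))
    P_next = [[x + q for x, q in zip(xr, qr)] for xr, qr in zip(prod, Q)]    # (30)
    return Delta, Phi_star, P_next

# Hypothetical two-state model: position plus velocity, position observed.
Phi = [[1.0, 1.0], [0.0, 1.0]]
M = [[1.0, 0.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
P0 = [[1.0, 0.0], [0.0, 1.0]]                # assumed P*(t0)

Delta0, Phi_star0, P1 = filter_step(Phi, M, Q, P0)
quadratic_loss = P1[0][0] + P1[1][1]         # trace P*(t0 + 1), cf. (27)
```

Calling `filter_step` repeatedly, feeding each returned covariance back in as `P`, reproduces the cycle described in the text: (28) gives the gain, (29) the filter transition matrix, and (30) the next error covariance.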

Remarks. (h) Eliminating $\Delta^*$ and $\Phi^*$ from (28-30), a nonlinear
difference equation is obtained for $P^*(t)$:

$$P^*(t+1) = \Phi(t+1;t)\{P^*(t) - P^*(t)M'(t)[M(t)P^*(t)M'(t)]^{-1}M(t)P^*(t)\}\Phi'(t+1;t) + Q(t) \qquad t \ge t_0 \tag{32}$$

This equation is linear only if $M(t)$ is invertible, but then the
problem is trivial since all components of the random vector $x(t)$
are observable and $P^*(t+1) = Q(t)$. Observe that equation (32)
plays a role in the present theory analogous to that of the Wiener-
Hopf equation in the conventional theory.
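
The elimination described in Remark (h) can be written out step by step. The following is only a sketch of the intermediate algebra, using nothing beyond (28)-(30); the layout is ours, not the paper's.

```latex
\begin{aligned}
P^*(t+1) &= \Phi^*(t+1;t)\,P^*(t)\,\Phi'(t+1;t) + Q(t)
  && \text{by (30)} \\
&= [\Phi(t+1;t) - \Delta^*(t)M(t)]\,P^*(t)\,\Phi'(t+1;t) + Q(t)
  && \text{by (29)} \\
&= \Phi(t+1;t)\bigl\{P^*(t) - P^*(t)M'(t)[M(t)P^*(t)M'(t)]^{-1}M(t)P^*(t)\bigr\}\Phi'(t+1;t) + Q(t)
  && \text{by (28)}
\end{aligned}
```

When $M(t)$ is square and invertible, $P^*M'[MP^*M']^{-1}MP^* = P^*$, so the term in braces vanishes and $P^*(t+1) = Q(t)$, which is the trivial case mentioned in the remark.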

Once $P^*(t)$ has been computed via (32) starting at $t = t_0$, the
explicit specification of the optimal linear filter is immediately
available from formulas (26-30). Of course, the solution of
Equation (32), or of its differential-equation equivalent, is a
much simpler task than solution of the Wiener-Hopf equation.

(i) The results stated in Theorem 3 do not resolve completely
Problem I. Little has been said, for instance, about the physical
significance of the assumptions needed to obtain equation (25),
the convergence and stability of the nonlinear difference equation
(32), the stability of the optimal filter (21), etc. This can
actually be done in a completely satisfactory way, but must be
left to a future paper. In this connection, the principal guide and
tool turns out to be the duality theorem mentioned briefly in the
next section. See [29].

(j) By letting the sampling period (equal to one so far) approach
zero, the method can be used to obtain the specification of
a differential equation for the optimal filter. To do this, i.e., to
pass from equation (14) to equation (12), requires computing
the logarithm $F^*$ of the matrix $\Phi^*$. But this can be done only if
$\Phi^*$ is nonsingular, which is easily seen not to be the case. This
is because it is sufficient for the optimal filter to have $n - p$ state
variables, rather than $n$, as the formalism of equation (22) would
seem to imply. By appropriate modifications, therefore, equation
(22) can be reduced to an equivalent set of only $n - p$ equations
whose transition matrix is nonsingular. Details of this type
will be covered in later publications.

(k) The dynamic system (21) is, in general, nonstationary.
This is due to two things: (1) the time dependence of $\Phi(t+1;t)$
and $M(t)$; (2) the fact that the estimation starts at $t = t_0$ and
improves as more data are accumulated. If $\Phi$, $M$ are constants,
it can be shown that (21) becomes a stationary dynamic system in
the limit $t \to \infty$. This is the case treated by the classical Wiener
theory.

(l) It is noteworthy that the derivations given are not affected
by the nonstationarity of the model for $x(t)$ or the finiteness of
available data. In fact, as far as the author is aware, the only
explicit recursion relations given before for the growing-memory
filter are due to Blum [12]. However, his results are much more
complicated than ours.
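
The stationary limit noted in Remark (k) can be illustrated by iterating (32) for constant matrices. The sketch below assumes a hypothetical model with $\Phi = \begin{bmatrix}1 & 1\\ 0 & 1\end{bmatrix}$, $M = [1\ \ 0]$ and $Q = qI$; for this special case (32) collapses to a scalar recursion, which is worked out in the comments. None of these values come from the paper.

```python
import math

def riccati_step(P, q):
    """Equation (32) for Phi = [[1, 1], [0, 1]], M = [1, 0], Q = q * I."""
    (p11, p12), (_, p22) = P
    # P - P M' (M P M')^{-1} M P: observing x1 exactly leaves only the
    # conditional variance of x2, in the lower-right corner.
    v = p22 - p12 * p12 / p11
    # Phi [[0, 0], [0, v]] Phi' + Q
    return [[v + q, v], [v, v + q]]

P = [[1.0, 0.0], [0.0, 1.0]]        # assumed initial covariance P*(t0)
history = [P]
for _ in range(60):
    P = riccati_step(P, 1.0)
    history.append(P)

# With q = 1 the recursion is v -> (2v + 1)/(v + 1), whose positive fixed
# point is the golden ratio.
golden = (1.0 + math.sqrt(5.0)) / 2.0
```

For this model the iterates settle quickly, so the gain (28) and hence the filter (21) become constant, which is the stationary situation of the classical theory.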

(m) By inspection of Fig. 3 we see that the optimal filter is a
feedback system, and that the signal after the first summer is
white noise since $\tilde{y}(t|t-1)$ is obviously an orthogonal random
process. This corresponds to some well-known results in Wiener
filtering, see, e.g., Smith [28], Chapter 6, Fig. 6-4. However,
this is apparently the first rigorous proof that every Wiener filter
is realizable by means of a feedback system. Moreover, it will be
shown in another paper that such a filter is always stable, under
very mild assumptions on the model (16-17). See [29].

The Dual Problem

Let us now consider another problem which is conceptually
very different from optimal estimation, namely, the noise-free
regulator or servomechanism problem.

Problem II. Consider the dynamic system

$$x(t+1) = \hat{\Phi}(t+1;t)x(t) + \hat{M}(t)u(t) \tag{33}$$

where $x(t)$ is an $n$-vector, $u(t)$ is an $m$-vector $(m \le n)$, and $\hat{\Phi}$, $\hat{M}$ are
$n \times n$ resp. $n \times m$ matrices whose elements are nonrandom functions
of time. Given any state $x(t)$ at time $t$, we are to find a sequence
$u(t), \ldots, u(T)$ of control vectors which minimizes the performance
index

$$V[x(t)] = \sum_{\tau=t}^{T+1} x'(\tau)\hat{Q}(\tau)x(\tau)$$

where $\hat{Q}(t)$ is a positive definite matrix whose elements are nonrandom
functions of time. See Fig. 2, where $\Delta = \hat{M}$ and $M = I$.

Probabilistic considerations play no part in Problem II; it is
implicitly assumed that every state variable can be measured
exactly at each instant $t, t+1, \ldots, T$. It is customary to call
$T \ge t$ the terminal time (it may be infinite).

The first general solution of the noise-free regulator problem is
due to the author [18]. The main result is that the optimal control
vectors $u^*(t)$ are nonstationary linear functions of $x(t)$. After
a change in notation, the formulas of the Appendix, Reference
[18], are as follows:

$$u^*(t) = -\hat{\Delta}^*(t)x(t) \tag{34}$$

Under optimal control as given by (34), the "closed-loop" equations
for the system are (see Fig. 4)

$$x(t+1) = \hat{\Phi}^*(t+1;t)x(t)$$

and the minimum performance index at time $t$ is given by

$$V^*[x(t)] = x'(t)\hat{P}^*(t-1)x(t)$$

The matrices $\hat{\Delta}^*(t)$, $\hat{\Phi}^*(t+1;t)$, $\hat{P}^*(t)$ are determined by the
recursion relations:

$$\hat{\Delta}^*(t) = [\hat{M}'(t)\hat{P}^*(t)\hat{M}(t)]^{-1}\hat{M}'(t)\hat{P}^*(t)\hat{\Phi}(t+1;t) \tag{35}$$

$$\hat{\Phi}^*(t+1;t) = \hat{\Phi}(t+1;t) - \hat{M}(t)\hat{\Delta}^*(t) \qquad t \le T \tag{36}$$

$$\hat{P}^*(t-1) = \hat{\Phi}'(t+1;t)\hat{P}^*(t)\hat{\Phi}^*(t+1;t) + \hat{Q}(t) \tag{37}$$

Initially we must set $\hat{P}^*(T) = \hat{Q}(T+1)$.

Fig. 4 Matrix block diagram of optimal controller

Comparing equations (35-37) with (28-30) and Fig. 3 with Fig.
4 we notice some interesting things which are expressed precisely
by

Theorem 4. (Duality Theorem) Problem I and Problem II are
duals of each other in the following sense:

Let $\tau \ge 0$. Replace every matrix $X(t) = X(t_0 + \tau)$ in (28-30)
by $\hat{X}'(t) = \hat{X}'(T - \tau)$. Then one has (35-37). Conversely,
replace every matrix $\hat{X}(T - \tau)$ in (35-37) by $X'(t_0 + \tau)$. Then
one has (28-30).

Proof. Carry out the substitutions. For ease of reference, the
dualities between the two problems are given in detail in Table 1.

Table 1

No. | Problem I | Problem II
1 | $x(t)$: (unobservable) state variables of random process. | $x(t)$: (observable) state variables of plant to be regulated.
2 | $y(t)$: observed random variables. | $u(t)$: control variables.
3 | $t_0$: first observation. | $T$: last control action.
4 | $\Phi(t_0+\tau+1;\,t_0+\tau)$: transition matrix. | $\hat{\Phi}(T-\tau+1;\,T-\tau)$: transition matrix.
5 | $P^*(t_0+\tau)$: covariance of optimized estimation error. | $\hat{P}^*(T-\tau)$: matrix of quadratic form for performance index under optimal regulation.
6 | $\Delta^*(t_0+\tau)$: weighting of observation for optimal estimation. | $\hat{\Delta}^*(T-\tau)$: weighting of state for optimal control.
7 | $\Phi^*(t_0+\tau+1;\,t_0+\tau)$: transition matrix for optimal estimation error. | $\hat{\Phi}^*(T-\tau+1;\,T-\tau)$: transition matrix under optimal regulation.
8 | $M(t_0+\tau)$: effect of state on observation. | $\hat{M}(T-\tau)$: effect of control vectors on state.
9 | $Q(t_0+\tau)$: covariance of random excitation. | $\hat{Q}(T-\tau)$: matrix of quadratic form defining error criterion.

Remarks. (n) The mathematical significance of the duality between
Problem I and Problem II is that both problems reduce to
the solution of the Wiener-Hopf-like equation (32).

(o) The physical significance of the duality is intriguing. Why
are observations and control dual quantities?

Recent research [29] has shown that the essence of the Duality
Theorem lies in the duality of constraints at the output (represented
by the matrix $M(t)$ in Problem I) and constraints at the
input (represented by the matrix $\hat{M}(t)$ in Problem II).

(p) Applications of Wiener's methods to the solution of the noise-free
regulator problem have been known for a long time; see the
recent textbook of Newton, Gould, and Kaiser [27]. However,
the connections between the two problems, and in particular the
duality, have apparently never been stated precisely before.

(q) The duality theorem offers a powerful tool for developing
more deeply the theory (as opposed to the computation) of Wiener
filters, as mentioned in Remark (i). This will be published
elsewhere [29].

Applications

The power of the new approach to the Wiener problem, as expressed
by Theorem 3, is most obvious when the data of the
problem are given in numerical form. In that case, one simply
performs the numerical computations required by (28-30). Results
of such calculations, in some cases of practical engineering
interest, will be published elsewhere.

When the answers are desired in closed analytic form, the iterations
(28-30) may lead to very unwieldy expressions. In a few
cases, $\Delta^*$ and $\Phi^*$ can be put into "closed form." Without discussing
here how (if at all) such closed forms can be obtained, we now give
two examples indicative of the type of results to be expected.

Example 1. Consider the problem mentioned under "Optimal
Estimates." Let $x_1(t)$ be the signal and $x_2(t)$ the noise. We assume
the model:

$$x_1(t+1) = \phi_{11}(t+1;t)x_1(t) + u_1(t)$$

$$x_2(t+1) = u_2(t)$$

$$y_1(t) = x_1(t) + x_2(t)$$

The specific data for which we desire a solution of the estimation
problem are as follows:

1 $t_1 = t + 1$; $t_0 = 0$

2 $E\,x_1^2(0) = 0$, i.e., $x_1(0) = 0$

3 $E\,u_1^2(t) = a^2$, $E\,u_2^2(t) = b^2$, $E\,u_1(t)u_2(t) = 0$ (for all $t$)

4 $\phi_{11}(t+1;t) = \phi_{11} = \text{const.}$

A simple calculation shows that the following matrices satisfy the
difference equations (28-30), for all $t \ge t_0$:

$$\Delta^*(t) = \begin{bmatrix} \phi_{11}C(t) \\ 0 \end{bmatrix}$$

$$\Phi^*(t+1;t) = \begin{bmatrix} \phi_{11}[1 - C(t)] & -\phi_{11}C(t) \\ 0 & 0 \end{bmatrix}$$

$$P^*(t+1) = \begin{bmatrix} a^2 + \phi_{11}^2 b^2 C(t) & 0 \\ 0 & b^2 \end{bmatrix}$$

where

$$C(t+1) = 1 - \frac{b^2}{a^2 + b^2 + \phi_{11}^2 b^2 C(t)} \qquad t \ge 0 \tag{38}$$

Since it was assumed that $x_1(0) = 0$, neither $x_1(1)$ nor $x_2(1)$ can
be predicted from the measurement of $y_1(0)$. Hence the measurement
at time $t = 0$ is useless, which shows that we should set
$C(0) = 0$. This fact, with the iterations (38), completely determines
the function $C(t)$. The nonlinear difference equation (38)
plays the role of the Wiener-Hopf equation.

If $b^2/a^2 \ll 1$, then $C(t) \approx 1$, which is essentially pure prediction.
If $b^2/a^2 \gg 1$, then $C(t) \approx 0$, and we depend mainly on
$x_1^*(t|t-1)$ for the estimation of $x_1^*(t+1|t)$ and assign only
very small weight to the measurement $y_1(t)$; this is what one
would expect when the measured data are very noisy.

In any case, $x_2^*(t|t-1) = 0$ at all times; one cannot predict
independent noise! This means that $\phi^*_{12}$ can be set equal to
zero. The optimal predictor is a first-order dynamic system.
See Remark (j).

To find the stationary Wiener filter, let $t = \infty$ on both sides of
(38), solve the resulting quadratic equation in $C(\infty)$, etc.

Example 2. A number of particles leave the origin at time $t_0 = 0$
with random velocities; after $t = 0$, each particle moves with a
constant (unknown) velocity. Suppose that the position of one
of these particles is measured, the data being contaminated by
stationary, additive, correlated noise. What is the optimal
estimate of the position and velocity of the particle at the time
of the last measurement?

Let $x_1(t)$ be the position and $x_2(t)$ the velocity of the particle;
$x_3(t)$ is the noise. The problem is then represented by the model,

$$x_1(t+1) = x_1(t) + x_2(t)$$

$$x_2(t+1) = x_2(t)$$

$$x_3(t+1) = \phi_{33}(t+1;t)x_3(t) + u_3(t)$$

$$y_1(t) = x_1(t) + x_3(t)$$

and the additional conditions

1 $t_1 = t$; $t_0 = 0$

2 $E\,x_1^2(0) = E\,x_2(0) = 0$, $E\,x_2^2(0) = a^2 > 0$;

3 $E\,u_3(t) = 0$, $E\,u_3^2(t) = b^2$.

4 $\phi_{33}(t+1;t) = \phi_{33} = \text{const.}$

According to Theorem 3, $x^*(t|t)$ is calculated using the dynamic
system (31).

First we solve the problem of predicting the position and velocity
of the particle one step ahead. Simple considerations
show that

$$P^*(1) = \begin{bmatrix} a^2 & a^2 & 0 \\ a^2 & a^2 & 0 \\ 0 & 0 & b^2 \end{bmatrix} \quad \text{and} \quad \Delta^*(0) = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

It is then easy to check by substitution into equations (28-30)
that

$$P^*(t) = \frac{b^2}{C_1(t-1)}\begin{bmatrix} t^2 & t & -\phi_{33}t(t-1) \\ t & 1 & -\phi_{33}(t-1) \\ -\phi_{33}t(t-1) & -\phi_{33}(t-1) & \phi_{33}^2(t-1)^2 + C_1(t-1) \end{bmatrix}$$

is the correct expression for the covariance matrix of the prediction
error $\tilde{x}(t|t-1)$ for all $t \ge 1$, provided that we define

$$C_1(0) = b^2/a^2$$

$$C_1(t) = C_1(t-1) + [t - \phi_{33}(t-1)]^2, \qquad t \ge 1$$

It is interesting to note that the results just obtained are valid
also when $\phi_{33}$ depends on $t$. This is true also in Example 1. In
conventional treatments of such problems there seems to be an
essential difference between the cases of stationary and nonstationary
noise. This misleading impression created by the conventional
theory is due to the very special methods used in solving
the Wiener-Hopf equation.
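
The closed-form covariance stated for Example 2 can be compared against a direct iteration of (28)-(30). The sketch below starts from $P^*(1)$ as given in the example and steps the recursion forward, tracking the largest entrywise difference from the closed form. The numerical values of $a$, $b$ and $\phi_{33}$ are arbitrary illustrative choices, and the helper names are ours.

```python
def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def filter_step(Phi, M, Q, P):
    """P*(t) -> P*(t+1) via (28)-(30), scalar observation (M is 1 x n)."""
    PMt = matmul(P, transpose(M))
    s = matmul(M, PMt)[0][0]
    Delta = [[row[0] / s] for row in matmul(Phi, PMt)]                       # (28)
    DM = matmul(Delta, M)
    Phi_star = [[p - d for p, d in zip(pr, dr)] for pr, dr in zip(Phi, DM)]  # (29)
    prod = matmul(matmul(Phi_star, P), transpose(Phi))
    return [[x + q for x, q in zip(xr, qr)] for xr, qr in zip(prod, Q)]      # (30)

def closed_form_P(t, a, b, phi):
    """Stated covariance of the prediction error, valid for t >= 1."""
    c1 = b * b / (a * a)                      # C_1(0)
    for k in range(1, t):                     # build C_1(t - 1)
        c1 += (k - phi * (k - 1)) ** 2
    k0 = b * b / c1
    s = t - 1
    return [[k0 * t * t, k0 * t, -k0 * phi * t * s],
            [k0 * t, k0, -k0 * phi * s],
            [-k0 * phi * t * s, -k0 * phi * s, k0 * (phi * phi * s * s + c1)]]

a, b, phi = 2.0, 1.5, 0.7                     # arbitrary illustrative values
Phi = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, phi]]
M = [[1.0, 0.0, 1.0]]
Q = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, b * b]]
P = [[a * a, a * a, 0.0], [a * a, a * a, 0.0], [0.0, 0.0, b * b]]   # P*(1)

worst = 0.0
for t in range(1, 30):
    target = closed_form_P(t, a, b, phi)
    worst = max(worst, max(abs(x - y) for rx, ry in zip(P, target)
                           for x, y in zip(rx, ry)))
    P = filter_step(Phi, M, Q, P)
```

Agreement to rounding error over these steps is a spot check of the formula for one parameter choice, not a proof.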
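Introducing the abbreviation

$$C_2(0) = 0$$

$$C_2(t) = t - \phi_{33}(t-1), \qquad t \ge 1$$

and observing that

$$\operatorname{cov}\tilde{x}(t+1|t) = P^*(t+1) = \Phi(t+1;t)[\operatorname{cov}\tilde{x}(t|t)]\Phi'(t+1;t) + Q(t)$$

the matrices occurring in equation (31) and the covariance
matrix of $\tilde{x}(t|t)$ are found after simple calculations. We have,
for all $t \ge 0$,

$$\Phi(t;t+1)\Delta^*(t) = \frac{1}{C_1(t)}\begin{bmatrix} tC_2(t) \\ C_2(t) \\ C_1(t) - tC_2(t) \end{bmatrix}$$

$$\Phi(t;t+1)\Phi^*(t+1;t)\Phi(t;t-1) = \frac{1}{C_1(t)}\begin{bmatrix} C_1(t) - tC_2(t) & C_1(t) - tC_2(t) & -\phi_{33}tC_2(t) \\ -C_2(t) & C_1(t) - C_2(t) & -\phi_{33}C_2(t) \\ -C_1(t) + tC_2(t) & -C_1(t) + tC_2(t) & +\phi_{33}tC_2(t) \end{bmatrix}$$

and

$$\operatorname{cov}\tilde{x}(t|t) = E\,\tilde{x}(t|t)\tilde{x}'(t|t) = \frac{b^2}{C_1(t)}\begin{bmatrix} t^2 & t & -t^2 \\ t & 1 & -t \\ -t^2 & -t & t^2 \end{bmatrix}$$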

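To gain some insight into the behavior of this system, let us
examine the limiting case $t \to \infty$ of a large number of observations.
Then $C_1(t)$ obeys approximately the differential equation

$$dC_1(t)/dt \approx C_2^2(t) \qquad (t \gg 1)$$

from which we find

$$C_1(t) \approx (1 - \phi_{33})^2 t^3/3 + \phi_{33}(1 - \phi_{33})t^2 + \phi_{33}^2 t + b^2/a^2 \qquad (t \gg 1) \tag{39}$$

Using (39), we get further,

$$\Phi^{-1}\Phi^*\Phi \approx \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix} \quad \text{and} \quad \Phi^{-1}\Delta^* \approx \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \qquad (t \gg 1)$$

Thus as the number of observations becomes large, we depend
almost exclusively on $x_1^*(t|t)$ and $x_2^*(t|t)$ to estimate
$x_1^*(t+1|t+1)$ and $x_2^*(t+1|t+1)$. Current observations are used
almost exclusively to estimate the noise

$$x_3^*(t|t) \approx y_1(t) - x_1^*(t|t) \qquad (t \gg 1)$$

One would of course expect something like this since the problem
is analogous to fitting a straight line to an increasing number
of points.

As a second check on the reasonableness of the results given,
observe that the case $t \gg 1$ is essentially the same as prediction
based on continuous observations. Setting $\phi_{33} = 0$, we have

$$E\,\tilde{x}_1^2(t|t) \approx \frac{a^2 b^2 t^2}{b^2 + a^2 t^3/3} \qquad (t \gg 1;\ \phi_{33} = 0)$$

which is identical with the result obtained by Shinbrot [11],
Example 1, and Solodovnikov [14], Example 2, in their treatment
of the Wiener problem in the finite-length, continuous-data
case, using an approach entirely different from ours.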
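
Two of the closed-form statements in this section lend themselves to quick numerical spot checks: the iteration (38) of Example 1 together with its stationary value $C(\infty)$, and the large-$t$ approximation (39) of Example 2. The sketch below is illustrative only; the function names and the parameter values are assumptions, and the quadratic for $C(\infty)$ is the one obtained by setting $C(t+1) = C(t)$ in (38), namely $\phi_{11}^2 b^2 C^2 + (a^2 + b^2 - \phi_{11}^2 b^2)C - a^2 = 0$.

```python
import math

def c_sequence(a, b, phi, steps):
    """Iterate (38) from C(0) = 0 and return [C(0), ..., C(steps)]."""
    c, out = 0.0, [0.0]
    for _ in range(steps):
        c = 1.0 - b * b / (a * a + b * b + phi * phi * b * b * c)
        out.append(c)
    return out

def c_stationary(a, b, phi):
    """Positive root of the quadratic obtained from C(t + 1) = C(t) in (38)."""
    if phi == 0.0:
        return a * a / (a * a + b * b)
    A = phi * phi * b * b
    B = a * a + b * b - A
    return (-B + math.sqrt(B * B + 4.0 * A * a * a)) / (2.0 * A)

def c1_exact(t, a, b, phi):
    """C_1(t) of Example 2 from its defining recursion."""
    c1 = b * b / (a * a)
    for k in range(1, t + 1):
        c1 += (k - phi * (k - 1)) ** 2
    return c1

def c1_approx(t, a, b, phi):
    """Right-hand side of (39)."""
    return ((1 - phi) ** 2 * t ** 3 / 3 + phi * (1 - phi) * t ** 2
            + phi ** 2 * t + b * b / (a * a))

# Example 1: the iteration settles at the stationary value.
c_inf = c_stationary(1.0, 2.0, 0.9)
c_tail = c_sequence(1.0, 2.0, 0.9, 200)[-1]

# Example 2: ratio of exact C_1(t) to the approximation (39) for large t.
ratio = c1_exact(2000, 1.0, 2.0, 0.5) / c1_approx(2000, 1.0, 2.0, 0.5)
```

The same functions can be used to observe the two limiting regimes described in Example 1: small $b^2/a^2$ pushes $C(\infty)$ toward 1, and large $b^2/a^2$ pushes it toward 0.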

Conclusions 

This paper formulates and solves the Wiener problem from the
"state" point of view. On the one hand, this leads to a very
general treatment including cases which cause difficulties when
attacked by other methods. On the other hand, the Wiener
problem is shown to be closely connected with other problems in the
theory of control. Much remains to be done to exploit these
connections.

APPENDIX
RANDOM PROCESSES: BASIC CONCEPTS

For convenience of the reader, we review here some elementary
definitions and facts about probability and random processes.
Everything is presented with the utmost possible simplicity;
for greater depth and breadth, consult Laning and Battin [5] or
Doob [15].

A random variable is a function whose values depend on the outcome
of a chance event. The values of a random variable may be
any convenient mathematical entities; real or complex numbers,
vectors, etc. For simplicity, we shall consider here only real-valued
random variables, but this is no real restriction.
Random variables will be denoted by $x, y, \ldots$ and their values by $\xi, \eta, \ldots$.
Sums, products, and functions of random variables are also random
variables.

A random variable $x$ can be explicitly defined by stating the
probability that $x$ is less than or equal to some real constant $\xi$.
This is expressed symbolically by writing

$$\Pr(x \le \xi) = F_x(\xi); \qquad F_x(-\infty) = 0, \quad F_x(+\infty) = 1$$

$F_x(\xi)$ is called the probability distribution function of the random
variable $x$. When $F_x(\xi)$ is differentiable with respect to $\xi$, then
$f_x(\xi) = dF_x(\xi)/d\xi$ is called the probability density function of $x$.

The expected value (mathematical expectation, statistical average,
ensemble average, mean, etc., are commonly used synonyms) of
any nonrandom function $g(x)$ of a random variable $x$ is defined by

$$E g(x) = E[g(x)] = \int_{-\infty}^{\infty} g(\xi)\,dF_x(\xi) = \int_{-\infty}^{\infty} g(\xi)f_x(\xi)\,d\xi \tag{40}$$

As indicated, it is often convenient to omit the brackets after
the symbol $E$. A sequence of random variables (finite or infinite)

$$\{x(t)\} = \ldots, x(-1), x(0), x(1), \ldots \tag{41}$$

is called a discrete (or discrete-parameter) random (or stochastic)
process. One particular set of observed values of the random
process (41)

$$\ldots, \xi(-1), \xi(0), \xi(1), \ldots$$

is called a realization (or a sample function) of the process. Intuitively,
a random process is simply a set of random variables
which are indexed in such a way as to bring the notion of time into
the picture.

A random process is uncorrelated if

$$E\,x(t)x(s) = E\,x(t)\,E\,x(s) \qquad (t \ne s)$$

If, furthermore,

$$E\,x(t)x(s) = 0 \qquad (t \ne s)$$

then the random process is orthogonal. Any uncorrelated random
process can be changed into an orthogonal random process by replacing
$x(t)$ by $x'(t) = x(t) - E\,x(t)$ since then

$$E\,x'(t)x'(s) = E[x(t) - E\,x(t)]\cdot[x(s) - E\,x(s)]$$

$$= E\,x(t)x(s) - E\,x(t)\,E\,x(s) = 0$$

It is useful to remember that, if a random process is orthogonal, then

$$E[x(t_1) + x(t_2) + \ldots]^2 = E\,x^2(t_1) + E\,x^2(t_2) + \ldots \qquad (t_1 \ne t_2 \ne \ldots)$$
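
A small exact example may help fix these definitions. The sketch below takes two independent (hence uncorrelated) random variables on a finite sample space, centers them, and checks that the centered pair is orthogonal and that the mean square of the sum equals the sum of the mean squares. The particular values and probabilities are hypothetical, chosen only so that every expectation is an exact fraction.

```python
from fractions import Fraction
from itertools import product

xs = [1, 2, 3]          # values of x(t1), each with probability 1/3
ys = [0, 4]             # values of x(t2), each with probability 1/2
outcomes = [(x, y, Fraction(1, len(xs) * len(ys))) for x, y in product(xs, ys)]

def expect(g):
    """Expected value of g(x, y) over the joint distribution."""
    return sum(p * g(x, y) for x, y, p in outcomes)

mx = expect(lambda x, y: x)
my = expect(lambda x, y: y)

raw_cross = expect(lambda x, y: x * y)                      # E x(t1) x(t2)
centered_cross = expect(lambda x, y: (x - mx) * (y - my))   # after centering

# Mean square of the sum versus sum of mean squares for the centered pair.
lhs = expect(lambda x, y: ((x - mx) + (y - my)) ** 2)
rhs = expect(lambda x, y: (x - mx) ** 2) + expect(lambda x, y: (y - my) ** 2)
```

Here `raw_cross` equals the product of the means (the pair is uncorrelated but not orthogonal), while `centered_cross` is exactly zero.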

If $x$ is a vector-valued random variable with components $x_1, \ldots, x_n$
(which are of course random variables), the matrix

$$[E(x_i - E\,x_i)(x_j - E\,x_j)] = E(x - E\,x)(x' - E\,x') = \operatorname{cov} x \tag{42}$$

is called the covariance matrix of $x$.


A random process may be specified explicitly by stating the
probability of simultaneous occurrence of any finite number of
events of the type

$$x(t_1) \le \xi_1, \ldots, x(t_n) \le \xi_n; \qquad (t_1 \ne \ldots \ne t_n), \text{ i.e.,}$$

$$\Pr[x(t_1) \le \xi_1, \ldots, x(t_n) \le \xi_n] = F_{x(t_1), \ldots, x(t_n)}(\xi_1, \ldots, \xi_n) \tag{43}$$

where $F_{x(t_1), \ldots, x(t_n)}$ is called the joint probability distribution
function of the random variables $x(t_1), \ldots, x(t_n)$. The joint
probability density function is then

$$f_{x(t_1), \ldots, x(t_n)}(\xi_1, \ldots, \xi_n) = \frac{\partial^n F_{x(t_1), \ldots, x(t_n)}}{\partial\xi_1 \cdots \partial\xi_n}$$

provided the required derivatives exist. The expected value
$E g[x(t_1), \ldots, x(t_n)]$ of any nonrandom function of $n$ random variables
is defined by an $n$-fold integral analogous to (40).

A random process is independent if for any finite $t_1 \ne \ldots \ne t_n$,
(43) is equal to the product of the first-order distributions

$$\Pr[x(t_1) \le \xi_1] \cdots \Pr[x(t_n) \le \xi_n]$$

If a set of random variables is independent, then they are obviously
also uncorrelated. The converse is not true in general. For
a set of more than 2 random variables to be independent, it is not
sufficient that any pair of random variables be independent.

Frequently it is of interest to consider the probability distribution
of a random variable $x(t_{n+1})$ of a random process given the
actual values $\xi(t_1), \ldots, \xi(t_n)$ with which the random variables
$x(t_1), \ldots, x(t_n)$ have occurred. This is denoted by

$$\Pr[x(t_{n+1}) \le \xi_{n+1} \mid x(t_1) = \xi_1, \ldots, x(t_n) = \xi_n]$$

$$= \frac{\int_{-\infty}^{\xi_{n+1}} f_{x(t_1), \ldots, x(t_{n+1})}(\xi_1, \ldots, \xi_n, \eta)\,d\eta}{f_{x(t_1), \ldots, x(t_n)}(\xi_1, \ldots, \xi_n)} \tag{44}$$

which is called the conditional probability distribution function of
$x(t_{n+1})$ given $x(t_1), \ldots, x(t_n)$. The conditional expectation

$$E\{g[x(t_{n+1})] \mid x(t_1), \ldots, x(t_n)\}$$

is defined analogously to (40). The conditional expectation is a
random variable; it follows that

$$E[E\{g[x(t_{n+1})] \mid x(t_1), \ldots, x(t_n)\}] = E\{g[x(t_{n+1})]\}$$
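
The last identity can be verified exactly on a small discrete example. The sketch below uses a hypothetical joint distribution of two dependent random variables (the probabilities are invented for illustration), computes the conditional expectation of $g$ for each value of the conditioning variable, and then averages it over that variable.

```python
from fractions import Fraction

# Hypothetical joint distribution Pr[x(t1) = a, x(t2) = b] of two
# dependent random variables; the numbers are for illustration only.
joint = {(0, 1): Fraction(1, 8), (0, 3): Fraction(3, 8),
         (1, 1): Fraction(1, 4), (1, 3): Fraction(1, 4)}

def g(b):
    return b * b

def marginal(a):
    """Pr[x(t1) = a]."""
    return sum(p for (ai, _), p in joint.items() if ai == a)

def conditional_expectation(a):
    """E{g[x(t2)] | x(t1) = a}, the discrete analogue of (40) under (44)."""
    return sum(p * g(b) for (ai, b), p in joint.items() if ai == a) / marginal(a)

# Expectation of the conditional expectation versus the direct expectation.
outer = sum(marginal(a) * conditional_expectation(a) for a in (0, 1))
direct = sum(p * g(b) for (_, b), p in joint.items())
```

The conditional expectation takes different values for the two outcomes of the conditioning variable, which is what is meant by saying it is itself a random variable; its average nevertheless equals the unconditional expectation.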

In all cases of interest in this paper, integrals of the type (40)
or (44) need never be evaluated explicitly; only the concept of the
expected value is needed.

A random variable $x$ is gaussian (or normally distributed) if

$$f_x(\xi) = \frac{1}{[2\pi E(x - E\,x)^2]^{1/2}} \exp\left[-\frac{1}{2}\,\frac{(\xi - E\,x)^2}{E(x - E\,x)^2}\right]$$

which is the well-known bell-shaped curve. Similarly, a random
vector $x$ is gaussian if

$$f_x(\xi) = \frac{1}{(2\pi)^{n/2}(\det C)^{1/2}} \exp\left[-\frac{1}{2}(\xi - E\,x)'C^{-1}(\xi - E\,x)\right]$$

where $C^{-1}$ is the inverse of the covariance matrix (42) of $x$. A
gaussian random process is defined similarly.

The importance of gaussian random variables and processes is 
largely due to the following facts: 

Theorem 5. (A) Linear functions (and therefore conditional expectations)
on a gaussian random process are gaussian random
variables.

(B) Orthogonal gaussian random variables are independent.

(C) Given any random process with means $E\,x(t)$ and covariances
$E\,x(t)x(s)$, there exists a unique gaussian random process with the
same means and covariances.